PREFACE


The "topics in mathematical programming" that constituted our
research under this contract are those of Nondifferentiable
Optimization (NDO) and Integer Programming, the specialties of
the two principal investigators. They are dealt with below under
separate headings.


RESEARCH  IN  NONDIFFERENTIABLE  OPTIMIZATION 


1.  The IIASA Meeting

A meeting titled "Task Force on Nondifferentiable Optimization"
was held at the International Institute for Applied Systems
Analysis (IIASA) in Laxenburg, Austria, from March 28 to April 8,
1977. It had its origins in the identification of the subject of
Nondifferentiable Optimization (NDO) through the publication of
the volume Nondifferentiable Optimization*1 and was organized by
C. Lemarechal and R. Mifflin of IIASA, with the counsel of
Balinski and Wolfe. In the format of a small workshop, it was
attended by R. Fletcher (U.K.), J. Gauvin (Canada), J.-L. Goffin
(Canada), Lemarechal, R. Marsten (USA), Mifflin, B.T. Polyak
(USSR), B.N. Pschenichnyi (USSR), and Wolfe. Each day was devoted
to the work of one of the attendees, who presented it in the form
of a lecture and then discussed it in detail with the group. We
all found the exchange of great value. Of particular interest to
our own work were Marsten's detailed report on his "Boxstep"
method, Polyak's summary of the work of N.Z. Shor (the earliest
user of subgradient optimization in real optimization problems),
and Lemarechal's report on computational experience with
subgradient optimization, Wolfe's and his own versions of
"conjugate descent", and Shor's recent "dilatation" method.
Lemarechal reported that on the several problems tried the
methods ranked in efficiency in the order just listed. (His
observation is supported by our own experiments.) [...] and
indicate what progress we made in them.

2.  fication  of  conjugate  descent  methods 

We compared in detail the conjugate descent procedures of
Lemarechal and Wolfe, two closely related methods which can be
viewed as extensions of the method of conjugate gradients (used
for the minimization of smooth functions) to nondifferentiable
functions. Both procedures make essential use of the accumulation
of a "bundle" of previously calculated subgradients which is
used to determine a direction of descent. The methods differ in
the ordering of certain "resetting" steps that both algorithms
perform. We wrote a set of computer routines in which such
variants can readily be tried, and found almost no difference in
performance between the two methods when they were used in the
same context. Henceforth we can speak with some justification of
the conjugate descent method as a general procedure (involving
conjugation via a "nearest point calculation", and a line search)
of which the particular schemes mentioned, and others, are
variant implementations.

*1 M.L. Balinski and P. Wolfe. Nondifferentiable Optimization.
Mathematical Programming Study 3. North-Holland Publishing
Company, Amsterdam, 1976.

3.  Constraints  in  conjugate  descent 

We  devised  a means  for  using  conjugate  descent  for 
constrained  optimization  problems.  For  the  problem 

Min f(x) : g(x) <= 0

(we take a single constraint for simplicity), we minimize

F(x) = f(x) + K Max {0, g(x)}

for sufficiently large K. If the point x is on the boundary,
and b,c are respective subgradients of f,g there, then b+Kc
is a subgradient of F. We wish, of course, to make use of such
a subgradient when -b itself is not a feasible direction. The
new procedure is based on the observation that if both b and b+Kc
are members of the bundle of subgradients used to define the new
descent direction at x then for sufficiently large K that
direction will be the projection of -b onto the tangent plane to
the constraint; in our view, the most desirable outcome. The idea
works  for  any  number  of  constraints.  We  have  tested  it  on  some 
small-scale  problems  with  linear  constraints,  and  it  seems  to  be 
quite  effective.  The  report  "Constraints  in  Conjugate  Descent" 
describing  the  procedure  is  in  preparation. 
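
The observation above can be checked numerically in the simplest
case. The following is a minimal sketch, assuming the bundle holds
only the two subgradients b and b+Kc, so that the nearest point
calculation reduces to finding the point of least norm on a line
segment. The function names and the sample data are illustrative
assumptions and are not taken from the report in preparation.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def min_norm_on_segment(p, q):
    """Point of least Euclidean norm on the segment joining p and q."""
    d = [qi - pi for pi, qi in zip(p, q)]
    dd = dot(d, d)
    if dd == 0.0:
        return list(p)
    # Unconstrained minimizer of |p + t d|^2, clipped to 0 <= t <= 1.
    t = min(1.0, max(0.0, -dot(p, d) / dd))
    return [pi + t * di for pi, di in zip(p, d)]


def descent_direction(b, c, K):
    """Negative of the least-norm point of the bundle {b, b + K c}."""
    penalized = [bi + K * ci for bi, ci in zip(b, c)]
    return [-v for v in min_norm_on_segment(b, penalized)]


def project_onto_tangent_plane(v, c):
    """Orthogonal projection of v onto the plane {u : c.u = 0}."""
    s = dot(v, c) / dot(c, c)
    return [vi - s * ci for vi, ci in zip(v, c)]


# Hypothetical data: f(x) = -x1 - x2, g(x) = x2 - 1, with x on g(x) = 0.
b = [-1.0, -1.0]   # subgradient of f at x
c = [0.0, 1.0]     # subgradient of g at x
minus_b = [-bi for bi in b]

large_K = descent_direction(b, c, 10.0)
small_K = descent_direction(b, c, 0.5)
target = project_onto_tangent_plane(minus_b, c)
```

In this example the direction obtained with K = 10 coincides with
the projection of -b onto the tangent plane, while with K = 0.5 it
keeps a positive component along c and so still points out of the
feasible region, which is what "sufficiently large K" guards
against.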

4.  Univariate  optimization 

Finding the minimum of a convex function of a single variable
(to a specified degree of approximation) is an essential
subtask of an efficient conjugate descent procedure. Our
procedure for the piecewise-linear convex function has been
further polished, exercised on a variety of problems, and seems
to be foolproof. (It was the subject of IBM Invention Disclosure
Y08-77-0033, January, 1977.) It has been extended to general
convex functions using some of the ideas of our previous
algorithm for smooth functions, as reported in "Minimization of
nonsmooth univariate functions" (see Publications.) The
procedure is extraordinarily robust and, we think, efficient,
although there is no question that the piecewise-linear minimizer
should be used when a problem is known to have that character.
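
For orientation only, the univariate subtask can be illustrated
with a much cruder device than the procedures described above,
whose details are not given here. The sketch below assumes a
bracketing interval known to contain a minimizer and simply
bisects on the sign of a subgradient; the names and the sample
piecewise-linear function are hypothetical.

```python
def minimize_convex_1d(subgradient, lo, hi, tol=1e-8):
    """Approximate a minimizer of a convex function on [lo, hi].

    `subgradient(t)` must return some subgradient of the function at t.
    The interval is halved according to the sign of that subgradient.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        s = subgradient(mid)
        if s > 0:
            hi = mid        # function is increasing at mid
        elif s < 0:
            lo = mid        # function is decreasing at mid
        else:
            return mid      # zero subgradient: mid is a minimizer
    return 0.5 * (lo + hi)


# Hypothetical piecewise-linear convex function f(t) = max(-t, 2t - 3),
# stored as (slope, intercept) pairs; its minimum is at t = 1.
PIECES = [(-1.0, 0.0), (2.0, -3.0)]


def f(t):
    return max(a * t + k for a, k in PIECES)


def f_subgradient(t):
    """Slope of a piece that attains the maximum at t."""
    slope, _ = max(PIECES, key=lambda piece: piece[0] * t + piece[1])
    return slope


t_min = minimize_convex_1d(f_subgradient, -5.0, 5.0)
```

Bisection gains one binary digit of accuracy per subgradient
evaluation regardless of the shape of the function, which is why a
minimizer that exploits piecewise linearity is preferable when
that structure is known.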

5.  Benchmark problems

In  collaboration  with  Lemarechal  and  Mifflin  (while  they 
were  at  IIASA)  and  Goffin  at  the  University  of  Montreal  we  have 
defined  a set  of  four  (at  the  moment)  test  problems  that  we  agree 
are  representative  and  important  and  will  serve  as  a basis  for 
comparing  results.  They  will  be  maintained  in  machine-readable 
form  and  are  available  to  anyone. 

6.  The  solution  of  large  systems  of  linear  inequalities 

The 1957 work of Agmon and of Motzkin and Schoenberg on
"relaxation" methods for the solution of linear inequality
systems was the first application of what we now call
"subgradient optimization" to mathematical programming. It did
not have much impact at the time because the simplex method
proved more effective for the small problems of that era. Our
first work specifically aimed at NDO problems*2 reestablished the
importance of that approach for large-scale optimization
generally; subsequently, a number of important applications
emerged which can be modeled as requiring the approximate
solution of systems of many thousands of (very sparse) linear
inequalities in many thousands of variables: electron-beam
photolithography and x-ray tomography are two such applications
we have worked on. Such problems seriously tax, or even exceed,
the capabilities of any current implementation of the simplex
method, which further can benefit but little from the fact that
in many cases only approximate solutions are needed.
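
As a point of reference, the basic relaxation iteration can be
sketched in a few lines. The version below is the classical
"most violated inequality" form with a relaxation parameter,
written with dense Python lists for readability; it is an
illustration under those assumptions, not our implementation, and
the function names, parameter defaults, and sample system are
hypothetical.

```python
from math import sqrt


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def relaxation_solve(A, b, x0, relax=1.5, tol=1e-9, max_iter=10000):
    """Seek x with dot(A[i], x) <= b[i] for every i.

    Each step finds the inequality whose bounding hyperplane is farthest
    from x on the violated side and moves x toward it by `relax` times
    that distance (relax = 1 is an exact projection onto the half-space).
    """
    x = list(x0)
    for _ in range(max_iter):
        worst_row, worst_dist = None, tol
        for row, bi in zip(A, b):
            dist = (dot(row, x) - bi) / sqrt(dot(row, row))
            if dist > worst_dist:
                worst_row, worst_dist = row, dist
        if worst_row is None:
            return x        # every inequality holds to within tol
        step = relax * worst_dist / sqrt(dot(worst_row, worst_row))
        x = [xi - step * ri for xi, ri in zip(x, worst_row)]
    raise RuntimeError("no solution found within max_iter steps")


# Hypothetical system: x1 + x2 <= 4, x1 >= 1, x2 >= 1, x1 - x2 <= 1.
A = [[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0], [1.0, -1.0]]
b = [4.0, -1.0, -1.0, 1.0]
x = relaxation_solve(A, b, [10.0, -3.0])
```

Each step touches a single row of the system, so for very sparse
data the cost of a step is governed by the number of nonzero
entries in that row and by the work of locating a violated
inequality, not by the full dimensions of the system.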

Our work has been directed both at seeking to understand what
problem features are important for successful use of the method
and at improving on standard implementations. We have
experimented with (1) randomly generated problems of a certain
type, (2) the constraint sets defined by some standard small and
medium-size linear programming problems, and (3) some small
models of the E-beam photolithography problem. Quite
unexpectedly, for random problems with a given number of
variables, the method converged more rapidly for problems with
many inequalities than for few. We now think we can explain
that: the degree to which the set defined by the inequalities
"approximates" a sphere increases with the number of
inequalities, and the method can be shown to solve such
"spherical" problems with nearly perfect efficiency. We have
devised a simple estimation procedure which can eliminate the
need for calculating many of the inequalities in some large
systems (this is the subject of IBM Invention Disclosure
Y08-77-0202, April, 1977; a research report on this is in
preparation). We have also proved the feasibility of a certain
method of efficiently handling inequalities in small "blocks",
making use of our very quick algorithm*3 for finding the nearest
point in a given polyhedron to a given external point. In our
test problems these techniques have shown an order-of-magnitude
improvement in computing time as compared with our implementation
of "standard" methods.

*2 M. Held, P. Wolfe, and H.P. Crowder. "Validation of
Subgradient Optimization", Mathematical Programming 6 (1974),
62-88.
*3 P. Wolfe. Finding the Nearest Point in a Polytope.
Mathematical Programming 11 (1976), 128-149.

7.  Bibliography

We have continued the maintenance of a Bibliography of
papers related to NDO which now runs to some 260 items.


Summary  of  the  two-year  effort  in  NDO 

We began our work concentrating on the exploitation of
conjugate descent methods, which at the time seemed to hold the
greatest promise for efficient solution of general NDO problems,
and we feel that their success has been demonstrated for problems
of middling size when calculation of the function being minimized
is expensive.

Unfortunately, the amount of work per step required by such
methods increases at least linearly with the number of variables
in the problem, and they do not seem feasible for the very
interesting problems mentioned above involving thousands of
variables. Subgradient optimization shows little dependence on
the number of variables; its speed of convergence depends much
more on specific features of the problem that we are just
beginning to understand. We do know, though, that the
calculations required for it can be performed with great economy
in the typical case that the data of the problem are very sparse.
Our work so far has shown that large improvements in the
procedure can be made using rather simple devices, and we think
that there is a great deal more along that line to exploit.

RESEARCH IN INTEGER PROGRAMMING


1.  Subadditive  Functions  for  Describing  Facets  of  Additive 
Systems 

The thrust of the work here has been to obtain a central
theory offering some description of polyhedra for pure integer
programming problems. The subadditive function characterization
of facets has been extended to a very general combinatorial
optimization problem involving an additive system. The paper,
"On the Generality of the Subadditive Characterization of
Facets", describes this work. The theory encompasses, in the
same framework, antiblocking (or packing) type problems, as well
as blocking, so that there is no necessity, as in Araoz, for a
separate development employing superadditive functions.

The increased generality in treating semigroups not only
takes us away from the group relaxation in allowing us to
directly represent integer programming problems, but allows us to
represent problems of a noncommutative nature without having to
write down a linear programming relaxation, which may be awkward.

This work also raises a whole set of questions as to what
results from Gomory's original group paper carry over to this
more general framework. The mapping notion there, for deriving
facets of a problem from its subgroups, can be generalized to
give results relating facets for different additive systems. For
example, for a non-Abelian finite group, simply requiring
commutativity as a relation gives a mapping T onto an Abelian
group such that T(g+h) = T(g)+T(h) for all g,h in the
non-Abelian group. Thus, facets for the Abelian group give valid
inequalities and, in many cases, facets for the non-Abelian
problem.
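
A small concrete instance of such a mapping may be helpful. The
sketch below takes, as an assumed example, the non-Abelian group
of permutations of three objects, for which imposing
commutativity leaves the two-element Abelian group of integers
modulo 2, with T the parity of a permutation. The code only
verifies the relation T(g+h) = T(g)+T(h); it does not compute
facets, and the function names are hypothetical.

```python
from itertools import permutations


def compose(g, h):
    """Group operation 'g + h': apply h first, then g."""
    return tuple(g[h[i]] for i in range(len(h)))


def T(g):
    """Parity of the permutation g: number of inversions modulo 2."""
    n = len(g)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if g[i] > g[j])
    return inversions % 2


GROUP = list(permutations(range(3)))   # the six permutations of {0, 1, 2}

# The group operation is not commutative ...
noncommuting = [(g, h) for g in GROUP for h in GROUP
                if compose(g, h) != compose(h, g)]

# ... yet T carries it onto addition modulo 2.
is_homomorphism = all(T(compose(g, h)) == (T(g) + T(h)) % 2
                      for g in GROUP for h in GROUP)
image = {T(g) for g in GROUP}
```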

2.  Algorithmic  implications 

The theory developed provides a framework into which
computational methods can be placed. For one special case, the
knapsack problem, we have indicated how traditional methods fit
into this framework. However, the "lifting methods", using
subadditive functions, open up many new algorithmic directions.

In addition, a reasonably satisfactory duality theory is finally
provided for integer programs. This duality theory is based on
linear programming duality and provides optimality criteria for
an integer primal solution and a dual subadditive function.
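
The optimality criterion can be illustrated on a small knapsack
problem in equality form: minimize c.x subject to a.x = b with x
a nonnegative integer vector. The sketch below relies on a
standard textbook fact, not on the particular functions developed
in this research: the problem's own value function is subadditive,
satisfies pi(a_j) <= c_j for each column, and attains the optimal
value at b. The data and function names are hypothetical.

```python
import math
from itertools import product


def knapsack_value_function(a, c, b):
    """z[d] = least cost of a nonnegative integer x with a.x = d, d = 0..b.

    Assumes positive integer weights a[j]; z[d] is infinite when no
    such x exists.
    """
    z = [math.inf] * (b + 1)
    z[0] = 0
    for d in range(1, b + 1):
        for aj, cj in zip(a, c):
            if aj <= d and z[d - aj] + cj < z[d]:
                z[d] = z[d - aj] + cj
    return z


def is_subadditive(pi):
    """Check pi[d1 + d2] <= pi[d1] + pi[d2] wherever d1 + d2 is in range."""
    n = len(pi)
    return all(pi[d1 + d2] <= pi[d1] + pi[d2]
               for d1 in range(n) for d2 in range(n - d1))


def brute_force_optimum(a, c, b):
    """Least cost over all feasible x, by direct enumeration."""
    ranges = [range(b // aj + 1) for aj in a]
    costs = [sum(cj * xj for cj, xj in zip(c, x))
             for x in product(*ranges)
             if sum(aj * xj for aj, xj in zip(a, x)) == b]
    return min(costs) if costs else math.inf


a, c, b = [3, 5, 7], [4, 6, 9], 16      # hypothetical knapsack data
pi = knapsack_value_function(a, c, b)

dual_feasible = (pi[0] == 0 and is_subadditive(pi)
                 and all(pi[aj] <= cj for aj, cj in zip(a, c)))
primal_optimum = brute_force_optimum(a, c, b)
```

For any dual-feasible pi and any feasible x, the chain
c.x >= sum of pi(a_j) x_j >= pi(b) gives the bound, so an x whose
cost equals pi(b) is certified optimal.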

As for solving problems, the effort has been redirected
toward finding suitable subadditive functions to apply directly
to the original problem. Two classes of functions have been
identified as being potentially useful for solving the knapsack
problem. The paper "Subadditive Methods for the Knapsack
Problem" discusses these functions and relates our methods to
previous methods.

3.  The NSF-CBMS Regional Conference

A series of ten lectures was presented at an NSF-CBMS
regional conference at the State University of New York in
Buffalo in June 1978. The lecture notes from that conference,
"Integer Programming: Facets, Subadditivity, and Duality for
Group and Semigroup Problems", will appear in the SIAM series.
The lectures detailed the extension of the subadditive function
approach to semigroup problems and related them to the blocking
pair theory of Fulkerson.

4.  Mixed  Integer  Programming 

The  extension  of  the  subadditive  approach  to  the  mixed 
problem  should  be  possible  once  the  gains  made  on  the  pure 
problem  have  been  consolidated  and  put  to  algorithmic  use. 

Summary  of  the  two-year  effort  in  integer  programming 

In combinatorial polyhedra, the main results are in the
paper "Support functions, blocking pairs, and antiblocking
pairs". There, a unified framework is provided allowing many
more combinatorial optimization problems to be viewed as either
blocking pairs or antiblocking pairs. A main result is necessary
and sufficient conditions for two polyhedra to be, respectively,
the blocker and antiblocker of some given polyhedron.

In  mixed  integer  programming,  the  main  results  are  in  the 
paper  "Duality  and  pricing  in  multiple  choice  right-hand  side 
problems".  There,  subadditive  functions  are  shown  to  provide  an 
adequate  dual  problem  but  not  complete  pricing  information. 
Computational  efforts  for  the  mixed  problem  have  been  delayed  by 
new  results  on  the  pure  problem  as  outlined  below. 

The subadditive function approach for pure problems has been
pushed forward in two directions. First, it has been seen to be
applicable to a wide variety of problems, even non-Abelian and
nonassociative addition systems. Secondly, some new functions
have been developed in order to directly attack pure integer
problems without going through the linear programming relaxation.
This work has been detailed for the knapsack problem.